AITopics

2501.04828

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
(18 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Oct-12-2023

Divorce Prediction with Machine Learning: Insights and LIME Interpretability

Ahsan, Md Manjurul

Divorce is one of the most common social issues in developed countries such as the United States. Almost 50% of recent marriages end in involuntary divorce or separation. Although people vary in how they are affected, even over time, and an incident like divorce may not interrupt an individual's daily activities, divorce still has a severe effect on the individual's mental health and personal life. Within the scope of this research, divorce prediction was carried out by evaluating a dataset named the 'divorce predictor dataset' to correctly classify married and divorced people using six different machine learning algorithms: Logistic Regression (LR), Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Classification and Regression Trees (CART), Gaussian Naïve Bayes (NB), and Support Vector Machines (SVM). Preliminary computational results show that algorithms such as SVM, KNN, and LDA can perform that task with an accuracy of 98.57%. This work's additional novel contribution is the detailed and comprehensive explanation of prediction probabilities using Local Interpretable Model-Agnostic Explanations (LIME). Utilizing LIME to analyze test results illustrates the possibility of differentiating between divorced and married couples. Finally, we have developed a divorce predictor app that considers the ten most important features potentially affecting couples' decisions about divorce; such tools can be used by anyone to assess the condition of their relationship.

accuracy, dataset, divorce, (11 more...)

2310.0862

Country:

North America > United States > Oklahoma > Cleveland County > Norman (0.14)
Asia > Middle East > Republic of Türkiye > Nevsehir Province > Nevsehir (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.56)

arXiv.org Artificial Intelligence Apr-15-2023

Explanations of Black-Box Models based on Directional Feature Interactions

Masoomi, Aria, Hill, Davin, Xu, Zhonghui, Hersh, Craig P, Silverman, Edwin K., Castaldi, Peter J., Ioannidis, Stratis, Dy, Jennifer

As machine learning algorithms are deployed ubiquitously in a variety of domains, it is imperative to make these often black-box models transparent. Several recent works explain black-box models by capturing the most influential features for prediction per instance; such explanation methods are univariate, as they characterize importance per feature. We extend univariate explanation to a higher order; this enhances explainability, as bivariate methods can capture feature interactions in black-box models, represented as a directed graph. Analyzing this graph enables us to discover groups of features that are equally important (i.e., interchangeable), while the notion of directionality allows us to identify the most influential features. We apply our bivariate method to Shapley value explanations, and experimentally demonstrate the ability of directional explanations to discover feature interactions. We show the superiority of our method over the state of the art on CIFAR10, IMDB, Census, Divorce, Drug, and gene data.

artificial intelligence, machine learning, natural language, (20 more...)

2304.0767

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(10 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (0.93)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Merritt, Sean H., Christensen, Alexander P.

An Experimental Study of Dimension Reduction Methods on Machine Learning Algorithms with Applications to Psychometrics

arXiv.org Artificial Intelligence Mar-21-2023

Developing interpretable machine learning models has become an increasingly important issue. One way in which data scientists have been able to develop interpretable models has been to use dimension reduction techniques. In this paper, we examine several dimension reduction techniques, including two recent approaches developed in the network psychometrics literature called exploratory graph analysis (EGA) and unique variable analysis (UVA). We compared EGA and UVA with two other dimension reduction techniques common in the machine learning literature (principal component analysis and independent component analysis), as well as with no reduction of the variables, on real data. We show that EGA and UVA perform as well as the other reduction techniques or no reduction. Consistent with previous literature, we show that dimension reduction can decrease, increase, or provide the same accuracy as no reduction of variables. Our tentative results suggest that dimension reduction tends to lead to better performance when used for classification tasks.

artificial intelligence, machine learning, reduction, (16 more...)

2210.1323

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Information Technology (0.93)
Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (1.00)

Eyecioglu, Onder, Hangun, Batuhan, Kayisli, Korhan, Yesilbudak, Mehmet

Performance Comparison of Different Machine Learning Algorithms on the Prediction of Wind Turbine Power Generation

arXiv.org Artificial Intelligence May-11-2021

Over the past decade, wind energy has gained more attention around the world. However, owing to its indirectness and volatility, wind power penetration has increased the difficulty and complexity of dispatching and planning electric power systems. High-precision wind power prediction is therefore needed in order to balance electrical power. For this purpose, this study compares in detail the prediction performance of linear regression, k-nearest neighbor regression, and decision tree regression algorithms. The k-nearest neighbor regression algorithm provides lower coefficient of determination values, while the decision tree regression algorithm produces lower mean absolute error values. In addition, the meteorological parameters of wind speed, wind direction, barometric pressure, and air temperature are evaluated in terms of their importance for the wind power parameter. The wind speed parameter achieves the highest importance factor. As a result, many useful assessments are made for wind power predictions.

prediction, regression, renewable energy research, (13 more...)

doi: 10.1109/ICRERA47325.2019.8996541

2105.05197

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.06)
Europe > Romania > Centru Development Region > Brașov County > Brașov (0.06)
Asia > Middle East > Republic of Türkiye > Nevsehir Province > Nevsehir (0.05)
(7 more...)

Genre: Research Report > New Finding (0.36)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Watanabe, Chihiro, Suzuki, Taiji

Deep Two-Way Matrix Reordering for Relational Data Analysis

arXiv.org Machine Learning May-10-2021

Matrix reordering is the task of permuting the rows and columns of a given observed matrix such that the resulting reordered matrix shows meaningful or interpretable structural patterns. Most existing matrix reordering techniques share the common processes of extracting some feature representations from an observed matrix in a predefined manner and applying matrix reordering based on them. However, in some practical cases, we do not always have prior knowledge about the structural pattern of an observed matrix. To address this problem, we propose a new matrix reordering method, called deep two-way matrix reordering (DeepTMR), using a neural network model. The trained network can automatically extract nonlinear row/column features from an observed matrix, which can then be used for matrix reordering. Moreover, the proposed DeepTMR provides the denoised mean matrix of a given observed matrix as an output of the trained network. This denoised mean matrix can be used to visualize the global structure of the reordered observed matrix. We demonstrate the effectiveness of the proposed DeepTMR by applying it to both synthetic and practical datasets.

deeptmr, matrix, reordered input matrix, (14 more...)

2103.14203

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)
Asia > Japan > Honshū > Kantō > Tochigi Prefecture (0.04)
(12 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Freidling, Tobias, Poignard, Benjamin, Climente-González, Héctor, Yamada, Makoto

Post-selection inference with HSIC-Lasso

arXiv.org Machine Learning Oct-29-2020

Detecting influential features in complex (non-linear and/or high-dimensional) datasets is key for extracting the relevant information. Most of the popular selection procedures, however, require assumptions on the underlying data, such as distributional ones, which barely agree with empirical observations. Therefore, feature selection based on nonlinear methods, such as the model-free HSIC-Lasso, is a more relevant approach. In order to ensure valid inference among the chosen features, the selection procedure must be accounted for. In this paper, we propose selective inference with HSIC-Lasso using the framework of truncated Gaussians together with the polyhedral lemma. Based on these theoretical foundations, we develop an algorithm allowing for low computational costs and the treatment of the hyper-parameter selection issue. The relevance of our method is illustrated using artificial and real-world datasets. In particular, our empirical findings emphasise that type-I error control at the considered level can be achieved.

artificial intelligence, estimator, machine learning, (17 more...)

2010.15659

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment > Sports > Golf (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Shahhosseini, Mohsen, Hu, Guiping

Improved Weighted Random Forest for Classification Problems

arXiv.org Machine Learning Sep-1-2020

Several studies have shown that combining machine learning models in an appropriate way improves on the individual predictions made by the base models. The key to building a well-performing ensemble model is the diversity of the base models. Among the most common solutions for introducing diversity into decision trees are bagging and random forest. Bagging enhances diversity by sampling with replacement and generating many training data sets, while random forest additionally selects a random number of features. This has made the random forest a winning candidate for many machine learning applications. However, assuming equal weights for all base decision trees does not seem reasonable, as the randomization of sampling and input feature selection may lead to different levels of decision-making ability across base decision trees. Therefore, we propose several algorithms that modify the weighting strategy of regular random forest and consequently make better predictions. The designed weighting frameworks include optimal weighted random forest based on accuracy, optimal weighted random forest based on the area under the curve (AUC), performance-based weighted random forest, and several stacking-based weighted random forest models. The numerical results show that the proposed models are able to introduce significant improvements compared to regular random forest.

artificial intelligence, machine learning, random forest, (16 more...)

2009.00534

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Ho, Lam Si Tung, Dinh, Vu

Consistent feature selection for neural networks via Adaptive Group Lasso

arXiv.org Machine Learning Jun-10-2020

One main obstacle for the wide use of deep learning in medical and engineering sciences is its interpretability. While neural network models are strong tools for making predictions, they often provide little information about which features play significant roles in influencing the prediction accuracy. To overcome this issue, many regularization procedures for learning with neural networks have been proposed for dropping non-significant features. Unfortunately, the lack of theoretical results casts doubt on the applicability of such pipelines. In this work, we propose and establish a theoretical guarantee for the use of the adaptive group lasso for selecting important features of neural networks. Specifically, we show that our feature selection method is consistent for single-output feed-forward neural networks with one hidden layer and hyperbolic tangent activation function. We demonstrate its applicability using both simulation and data analysis.

artificial intelligence, group lasso, machine learning, (16 more...)

2006.00334

Country:

North America > United States > Delaware > New Castle County > Newark (0.14)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
Atlantic Ocean > Black Sea (0.04)
Asia > Middle East > Republic of Türkiye > Nevsehir Province > Nevsehir (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)